Deep Reinforcement Learning with Regularized Convolutional Neural Fitted Q Iteration
Abstract
We review the deep reinforcement learning setting, in which an agent receiving high-dimensional input from an environment learns a control policy without supervision using multilayer neural networks. We then extend the Neural Fitted Q Iteration value-based reinforcement learning algorithm (Riedmiller et al.) with a novel variation, Regularized Convolutional Neural Fitted Q Iteration (RCNFQ), which incorporates convolutional neural networks, as in the Deep Q Network algorithm (Mnih et al.), and dropout regularization to improve generalization performance. Finally, we present an implementation of the algorithm and discuss several extensions.
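The abstract does not spell out the update rule. As a rough illustration only, and not the authors' implementation, the batch regression targets commonly used in fitted Q iteration methods can be sketched as below. The function name `fitted_q_targets`, the `q_values` callable, and the toy Q table are all hypothetical stand-ins for the network described above.

```python
def fitted_q_targets(transitions, q_values, gamma=0.99):
    """Compute regression targets for one fitted Q iteration sweep.

    transitions: list of (state, action, reward, next_state, done) tuples.
    q_values: callable mapping a state to a list of Q estimates, one per
              action (hypothetical stand-in for the current network).
    """
    targets = []
    for state, action, reward, next_state, done in transitions:
        # Terminal transitions have no bootstrapped future value.
        bootstrap = 0.0 if done else gamma * max(q_values(next_state))
        targets.append((state, action, reward + bootstrap))
    return targets


# Toy usage: a fixed Q table stands in for the network (assumption).
toy_q = {"s0": [0.0, 1.0], "s1": [2.0, 0.5]}
batch = [("s0", 1, 1.0, "s1", False), ("s1", 0, 5.0, "s0", True)]
targets = fitted_q_targets(batch, lambda s: toy_q[s], gamma=0.5)
```

In a fitted Q iteration loop, the function approximator would then be retrained on these (state, action, target) triples over the whole stored batch, and the sweep repeated.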
Similar resources
Deep Reinforcement Learning for Flappy Bird
Reinforcement learning is essential for applications where there is no single correct way to solve a problem. In this project, we show that deep reinforcement learning is very effective at learning how to play the game Flappy Bird, despite the high-dimensional sensory input. The agent is not given information about what the bird or pipes look like; it must learn these representations and directl...
Deep Belief Nets as Function Approximators for Reinforcement Learning
We describe a continuous state/action reinforcement learning method which uses deep belief networks (DBNs) in conjunction with a value function-based reinforcement learning algorithm to learn effective control policies. Our approach is to first learn a model of the state-action space from data in an unsupervised pretraining phase, and then use neural-fitted Q-iteration (NFQ) to learn an accurat...
CS229 Final Report Deep Q-Learning to Play Mario
In this paper, I study applying and adjusting DeepMind’s Atari Deep Q-Learning model to train an automatic agent to play the 1985 Nintendo game Super Mario Bros. The agent learns control policies from raw pixel data using deep reinforcement learning. The model is a convolutional neural network trained only on raw frames of the game and basic information such as score and motion.
An Empirical Comparison of Neural Architectures for Reinforcement Learning in Partially Observable Environments
This paper explores the performance of fitted neural Q iteration for reinforcement learning in several partially observable environments, using three recurrent neural network architectures: Long Short-Term Memory [7], Gated Recurrent Unit [3] and MUT1, a recurrent neural architecture evolved from a pool of several thousand candidate architectures [8]. A variant of fitted Q iteration, based on A...
Learning to Play the Worker-Placement Game Euphoria using Neural Fitted Q Iteration
We design and implement an agent for the popular worker placement and resource management game Euphoria using Neural Fitted Q Iteration (NFQ), a reinforcement learning algorithm that uses an artificial neural network for the action-value function which is updated off-line considering a sequence of training experiences rather than online as in typical Q-learning. We find that the agent is able t...
Publication date: 2016